PREDICT validity for prognosis of breast cancer patients with pathogenic BRCA1/2 variants

We assessed the PREDICT v 2.2 for prognosis of breast cancer patients with pathogenic germline BRCA1 and BRCA2 variants, using follow-up data from 5453 BRCA1/2 carriers from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA) and the Breast Cancer Association Consortium (BCAC). PREDICT for estrogen receptor (ER)-negative breast cancer had modest discrimination for BRCA1 carrier patients overall (Gönen & Heller unbiased concordance 0.65 in CIMBA, 0.64 in BCAC), but it distinguished clearly the high-mortality group from lower risk categories. In an analysis of low to high risk categories by PREDICT score percentiles, the observed mortality was consistently lower than the expected mortality, but the confidence intervals always included the calibration slope. Altogether, our results encourage the use of the PREDICT ER-negative model in management of breast cancer patients with germline BRCA1 variants. For the PREDICT ER-positive model, the discrimination was slightly lower in BRCA2 variant carriers (concordance 0.60 in CIMBA, 0.65 in BCAC). Especially, inclusion of the tumor grade distorted the prognostic estimates. The breast cancer mortality of BRCA2 carriers was underestimated at the low end of the PREDICT score distribution, whereas at the high end, the mortality was overestimated. These data suggest that BRCA2 status should also be taken into consideration with tumor characteristics, when estimating the prognosis of ER-positive breast cancer patients.

The online PREDICT tool for estimating breast cancer patient prognosis has been widely adopted by clinicians during the past decade 1,2 . The algorithm for expected mortality for up to 10 years after breast cancer diagnosis has been validated in patient cohorts from Western Europe, North America, and South-East Asia [3][4][5][6][7][8] . PREDICT handles estrogen receptor (ER)-positive and ER-negative breast cancers as distinct disease entities 2 . In either case, PREDICT estimates the prognosis according to a baseline hazard function and a proportional prognostic score, based on diagnosis age and tumor characteristics, such as size and grade, ki67 and HER2 expression, and the number of affected lymph nodes. Furthermore, the progesterone receptor expression (PgR) will be incorporated in the score in the near future 9 . In addition to expected mortality, PREDICT estimates the absolute benefit from multiple treatment lines, including adjuvant endocrine therapy, 2nd or 3rd generation chemotherapy, trastuzumab, or bisphosphonates.
Pathogenic variants in BRCA1 and BRCA2 confer a high lifetime risk of breast cancer and increased risk of ovarian cancer 10 . The BRCA1 carrier breast tumors are characteristically triple-negative, high-grade carcinomas, whereas BRCA2 carrier tumors are most often positive for estrogen receptor expression (ER-positive). The BRCA1/2 carriers are diagnosed at a younger age when compared to non-carriers 11 , and the typical BRCA1/2-associated tumor characteristics are enriched in the younger age groups 12,13 . The overall survival rate of breast cancer patients with pathogenic BRCA1/2 variants is lower than the survival of non-carriers 14,15 . However, the difference may be largely explained by differences in tumor pathology and incidence of secondary ovarian cancer [16][17][18] . Intriguingly, some studies have suggested that the effects associated with the conventional pathological prognostic factors could be opposite in BRCA1/2 carriers and non-carriers. For example, decreased survival of BRCA1/2 carriers with ER-positive breast cancer has been reported in several studies 16,[19][20][21][22] . Furthermore, the relevance of tumor grade as a prognostic factor for BRCA1/2 has been questioned repeatedly 22,23 .
We tested the PREDICT model in retrospective follow-up data from BRCA1/2 carrier patients from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA) and the Breast Cancer Association Consortium (BCAC).
A full list of author affiliations appears at the end of the paper.

RESULTS
Follow-up data were available for 2892 BRCA1 and 1813 BRCA2 variant carriers from CIMBA, and for 316 BRCA1 and 432 BRCA2 variant carriers from the BCAC. The pathology data were partially missing for many patients, but Multiple Imputation by Chained Equations (MICE, Supplementary Table 1) allowed inclusion of all patients with follow-up data. Multiple imputation requires that all statistical analyses are performed in parallel in the imputed datasets and that the analysis outputs are pooled for a final result. In the following, we use the term 'pooled' in this sense. We performed all analyses separately for the ER-negative and ER-positive patient groups (Table 1), corresponding to the specific PREDICT models for ER-negative and ER-positive breast cancer and the characteristic tumor phenotypes of the BRCA1 and BRCA2 variant carriers, respectively. PREDICT scores and the expected breast cancer-associated mortality were calculated according to algorithm v 2.2, including variables for diagnosis age, tumor grade and size, lymph node and HER2 status, and adjuvant therapy, and further adjusted for progesterone receptor expression 2,9,24,25 . The analyses included estimating the model discrimination, re-fitting the prognostic factors in a Cox regression model with the full score as an offset, and measuring the model calibration.

ER-negative PREDICT
The ER-negative PREDICT score was able to discriminate between the better and worse surviving BRCA1 carriers with ER-negative breast cancer. In a study-stratified analysis of the CIMBA BRCA1 carriers, the Gönen & Heller unbiased concordance for the PREDICT score with 15-year follow-up was 0.647, whereas in the analysis of the BCAC BRCA1 carriers the concordance was 0.637 and in the analysis of the CIMBA BRCA2 carriers 0.568. The model discrimination was slightly better when follow-up was restricted to the first five or ten years after the diagnosis (Table 2). The Gönen & Heller unbiased concordance derives the concordance probability directly from the Cox regression model. It does not depend on uninterrupted follow-up and is therefore more reliable than the AUC statistic for estimating discrimination in censored survival data. A concordance value of 0.50 suggests that a model is no better than a random guess, and a value of 1.0 implies perfect prediction. The Kaplan-Meier graphs of patient survival at discrete risk levels provide visual evidence of the discriminatory potential of PREDICT for ER-negative breast cancer in BRCA1 carriers (Fig. 1). We found no significant residual hazard associated with any of the tumor characteristics, on top of the ER-negative PREDICT score (Supplementary Table 2). Furthermore, a graphical examination of a spline of age-related hazard in the CIMBA BRCA1 carriers suggested that the age factor in the ER-negative PREDICT model fits well with the observed survival data (Supplementary Fig. 1).
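The idea behind a model-based concordance can be illustrated with a minimal Python sketch of the unsmoothed Gönen & Heller concordance probability estimator, computed from the fitted linear predictors of a Cox model alone. The function name is hypothetical and this is not the code used in the study, where the estimate was obtained with the R library CPE; note that tied linear predictors contribute zero in this unsmoothed form.

```python
from itertools import combinations
from math import exp, log


def gonen_heller_concordance(linear_predictors):
    """Unsmoothed Gonen & Heller concordance probability estimate.

    linear_predictors: fitted Cox linear predictors (beta' x), one per patient.
    Each pair with unequal predictors contributes 1 / (1 + exp(-|difference|));
    tied pairs contribute 0. No follow-up times or event indicators are needed.
    """
    pairs = list(combinations(linear_predictors, 2))
    if not pairs:
        raise ValueError("at least two patients are required")
    total = 0.0
    for a, b in pairs:
        diff = abs(a - b)
        if diff > 0:
            total += 1.0 / (1.0 + exp(-diff))
    return total / len(pairs)


# Two patients whose hazard ratio is 3: the estimate is 3 / (3 + 1) = 0.75.
example = gonen_heller_concordance([0.0, log(3)])
```

Because the estimate depends only on the spread of the linear predictors, it is not affected by the censoring pattern, which is the property referred to above.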
The ER-negative PREDICT algorithm overestimated breast cancer mortality in all BRCA1/2 patient groups with ER-negative breast cancer from CIMBA and BCAC (Table 3). The pooled expected mortality was outside the 95% confidence interval of the pooled observed mortality when examining either all BRCA1 or all BRCA2 patients together (Table 3, first and last two rows). Consistently, when calibration was tested in CIMBA patient subgroups dichotomized by tumor size, grade, HER2 expression, node status, or in three distinct age categories, the expected mortality was higher than the observed mortality (Table 3). A calibration plot of low-to-high PREDICT percentiles in the BRCA1 carriers with ER-negative breast cancer suggested a mild but consistent overestimation of 10-year mortality, with good separation of the middle-high (50-80%ile) and high (80-100%ile) mortality categories from the middle-low (20-50%ile) and low (0-20%ile) categories (Fig. 2a). However, the difference between expected and observed mortality was slightly smaller with a longer, 15-year, follow-up time (Supplementary Fig. 2).
In summary, the PREDICT score predicted survival with modest precision in the ER-negative BRCA1 carrier patients, although it tended to overestimate mortality at all risk levels. The prognostic impact of the individual risk factors in the BRCA1 carrier data did not deviate significantly from their weights in the PREDICT algorithm, and the high-risk patients were identified well. Thus, the PREDICT model estimated the mortality risk in ER-negative BRCA1 carriers with moderate accuracy, whereas for ER-negative BRCA2 carriers the analysis was inconclusive, due to the small cohort size.

ER-positive PREDICT score
The ability of the PREDICT ER-positive model to discriminate BRCA1/2 carriers was quite low in the CIMBA data, with Gönen & Heller concordance 0.601 for BRCA2 carriers and 0.551 for BRCA1 carriers, for follow-up time of 15 years after diagnosis, and equally poor for shorter follow-up of 10 years (Table 4). This was evident also in a modest separation of the Kaplan-Meier curves of BRCA1/2 carriers with ER-positive breast cancer in different PREDICT percentile-based risk categories (Fig. 3). However, in the smaller dataset of BRCA2 carriers from BCAC, the discrimination was higher, ranging from 0.665 for 5-year follow-up to 0.648 for 15-year follow-up (Table 4).
When the pathologic factors included in the PREDICT score were refitted in a Cox regression model with the PREDICT score as an offset, to explain the survival of the CIMBA BRCA2 carriers, the tumor grade had significant residual hazard in the opposite direction to the coefficients embedded in PREDICT (Table 5). A marginal residual hazard in the opposite direction was seen also for PgR status, tumor size, and the number of affected lymph nodes, suggesting an overall poor fit of the PREDICT ER-positive score for the BRCA2 carriers from CIMBA. When grade was removed from the PREDICT score and included as an independent categorical covariate in a Cox regression model, offsetting with the reduced score, no significant difference was associated with either grade 3 or grade 1 when compared to grade 2 (Supplementary Table 3). Consistently, excluding grade from the PREDICT score improved the score concordance in CIMBA BRCA2 carriers from 0.601 to 0.610, and also in BCAC BRCA2 carriers from 0.648 to 0.658, suggesting that the tumor grade has little value in the prognosis of BRCA2 carriers. A similar trend was seen also when restricting the follow-up time to ten years after diagnosis (Table 4).
The PREDICT ER-positive score includes a non-linear component for diagnosis age, with steeply increasing hazard for ages younger than 40 years, and moderately increasing hazard for ages above 50 years. The relative hazard associated with diagnosis age had a milder curve in the CIMBA BRCA2 carriers, when modeled with a spline. However, the PREDICT estimate was within the 95% confidence interval of the spline across ages 20 to 70 years ( Supplementary Fig. 3).
The overall 10-year observed mortality of the BRCA1/2 carriers with ER-positive breast cancer did not differ significantly from the PREDICT point estimate of expected mortality, either in data from CIMBA or BCAC (Table 6). However, a calibration plot of low to high risk categories of PREDICT percentiles (0-20%ile, 20-50%ile, 50-80%ile, 80-100%ile), suggested that PREDICT underestimated BRCA2 carrier 10-year mortality in the lower risk categories, whereas in the high risk category, the observed mortality was significantly lower than the expected mortality (Fig. 2b). A longer follow-up time of 15-years did not affect the pattern (Supplementary Fig. 4). When examined in subgroups dichotomized by tumor pathology, the patients with grade 3 or node-positive tumors had lower mortality than expected, but patients with either grade 2 or node-negative tumors had higher mortality than expected (Table 6).
In summary, the accuracy of the PREDICT score for estimating the survival in ER-positive patients was lower than the accuracy in the ER-negative population. Although the PREDICT model estimated the average survival in the whole ER-positive patient population with moderate accuracy, the model did not reliably discriminate the low- and high-risk groups. In particular, the prognostic impact of the tumor grade deviated highly significantly from the PREDICT model, possibly reflecting underlying differences in the impact of tumor grade on prognosis in BRCA1/2 carriers when compared to the patient populations on which the PREDICT model is based. In fact, the survival of BRCA2 carriers with grade 3 tumors was similar to the survival of BRCA2 carriers with grade 2 tumors. Thus, the accuracy of the PREDICT model for estimating mortality risk in ER-positive BRCA1 or BRCA2 carriers was suboptimal.

DISCUSSION
The primary motivation of PREDICT has been to provide a tool for clinicians to numerically estimate the benefit from adjuvant therapy. The relative benefit from adjuvant therapy is similar at all risk levels, but the absolute benefit is higher for patients at high risk of recurrence or cancer-associated death, making the risk of adverse side effects more acceptable in this group 2 . The algorithm was trained on a prospective population-based cohort from the UK, but multiple validation studies indicate that PREDICT gives reliable estimates also in many other populations 6,8,26 , despite significant differences in the baseline survival rates between countries 27 . Our analyses suggest that the PREDICT ER-negative model is equally valid for management of BRCA1/2 variant carriers with ER-negative breast cancer, but sub-optimal for estimating the prognosis of ER-positive breast cancer. Previous validation analyses of PREDICT version 2 have measured the discrimination with AUC (area under the curve) statistics, ranging from 0.696 to 0.75 for the ER-negative model 2,8 . The concordance in the BRCA1 carrier data was lower: 0.65 in data from CIMBA and 0.64 in data from BCAC, but sufficient to discriminate especially the poor survival group of the BRCA1 patients (Fig. 2).
Despite the good discrimination, PREDICT seemed to overestimate the risk of breast cancer-specific death. The difference between expected and observed mortality was about 7-8 percentage points ten years after diagnosis, but decreased with a longer follow-up time of 15 years (Supplementary Fig. 2). Van Maaren et al. previously reasoned that a difference of this magnitude has clinical impact, because it is sufficiently large to affect the choice of whether to administer adjuvant chemotherapy 8 . However, overestimating mortality is less detrimental than underestimating it, because it does not put at risk the access to a sufficiently efficient adjuvant therapy. Of the CIMBA BRCA1 carrier patients with ER-negative breast cancer who had adjuvant therapy recorded in the data (none/any), about 90% had received adjuvant chemotherapy, even in the lowest risk category (PREDICT 0-20%ile). A beneficial treatment response is one possible explanation for the difference between expected and observed mortality, even though the expected benefit from adjuvant therapy was embedded in the PREDICT score. The difference may also have arisen from the imputation process. M-status was missing for a substantial number of patients (Table 1). Filtering the patients with imputed M-status may have caused loss of early events. However, the expected-observed difference was equally large in BCAC, where the M-status was more frequently available. Thus, this does not appear to be a major source of bias, though it warrants caution in interpretation. Furthermore, the expected-observed difference is in keeping with a recent study, where the survival of BRCA1 carriers was nominally higher than the survival of non-carriers in a pathology- and treatment-adjusted analysis of patients with ER-negative breast cancer 18 .
BRCA2 variant carrier cancers are characteristically ER-positive. However, a recent study suggested that germline BRCA2 variants increase also the risk of triple-negative breast cancer, which is generally considered a poor-prognosis breast cancer subtype 13 . Our analyses on PREDICT in BRCA2 carriers with ER-negative breast cancer were inconclusive. The discrimination was low (0.568, Table 2), and breast cancer-associated survival good, with lower than expected mortality, similarly to the BRCA1 carriers with ER-negative breast cancer (Table 3, Fig. 1). Due to the low number of patients with grade 1 breast cancer (see Table 1), this subgroup was not separately analyzed for calibration.
In previous studies validating the PREDICT model in cohorts of unselected breast cancer patients, the discrimination of the ER-positive model has consistently been higher than the discrimination of the ER-negative model, with AUC statistics between 0.74 and 0.79 2,8 . In that respect, the PREDICT concordance of 0.60 in the CIMBA BRCA2 carriers with ER-positive breast cancer appeared strikingly low.
The offset and calibration analyses indicated that especially the tumor grade appeared to confound the PREDICT ER-positive score when predicting the BRCA2 variant carrier survival, whereas the factors related to the stage of malignant progression, like tumor size and node involvement, retained their predictive potential. These observations were made in the CIMBA data, but as omitting grade from the score improved its discrimination also in the BCAC data, the same trend appears to be present there as well. Earlier studies have suggested that the survival of BRCA2 carrier patients does not vary by tumor grade, after other pathologic factors have been taken into account [21][22][23] . In our analysis, where tumor grade was an independent covariate, the hazard associated with grade 3 in comparison to grade 2 was nominally lower, with a pooled P-value close to the significance threshold. However, in this kind of retrospective data, the observed survival differences cannot be separated from the treatment choice. Grade 3 BRCA2 carriers had more often received adjuvant chemotherapy or combined chemo-endocrine therapy than grade 2 patients (Supplementary Table 4), and the underlying differences in therapeutic practices for grade 2 and grade 3 ER-positive cancers may have contributed to the nominally lower survival of the grade 2 patients.
The overall calibration of PREDICT in BRCA2 carriers with ER-positive breast cancer was good. However, the calibration varied by the magnitude of the PREDICT score, the observed mortality being higher than the expected, especially in the lower-risk groups (Table 6, Fig. 2b). A recent BCAC study, comparing the BRCA1/2 carrier survival to survival of population-matched non-carriers, found BRCA2 pathogenic variants to be associated with decreased patient survival after ER-positive breast cancer 18 . Our analyses suggest that this difference would be emphasized in patient groups with milder clinical characteristics. Therefore, the PREDICT model does not appear well-suited for the management of BRCA2 carriers with ER-positive breast cancer. Similarly, the concordance of 0.55 does not provide much support for the PREDICT model in the management of BRCA1 variant carriers with ER-positive breast cancer.
As the purpose of PREDICT is to aid in the decision on adjuvant therapy, the fundamental question in our study was whether the BRCA1/2 carriers could be managed the same way as non-carriers, and especially, whether the BRCA1/2 carrier status is such vital information that genotyping the patients prior to therapy would be advisable. A strong family history of breast and ovarian cancer indicates a high likelihood of a germline BRCA1 or BRCA2 pathogenic variant. However, genotyping to explore the causes of the familial risk may take place only after the management of the proband's primary cancer. Furthermore, not all carriers have such family structure or records that would reveal the increased hereditary risk. Therefore, it is likely that many variant carriers with breast cancer are treated without knowledge of the carrier status. The characteristic mutational signature of the BRCA1/2 variant carrier cancer, homologous recombination deficiency 28 , makes the cancers responsive to platinum-based therapy or PARP inhibitors 29,30 , but most of the carriers are still treated according to standard indications 31 . Retrospective analyses have suggested that the benefit from the standard adjuvant chemotherapy regimens is similar for BRCA1/2 carriers and non-carriers, but the benefit from adjuvant endocrine therapy is limited 32,33 . Instead, oophorectomy has recently been suggested to reduce breast cancer recurrence and mortality of both BRCA1 and BRCA2 variant carriers [32][33][34] . In our study, the breast cancer-associated mortality of BRCA2 carriers with ER-positive breast cancer was higher than expected in a patient group where adjuvant chemotherapy was less frequently used, but lower than expected in the high-risk patient group where adjuvant chemotherapy was used more often (Supplementary Table 4).
The strengths of this study include a large number of cases with pathogenic BRCA1/2 germline variants and a stratified analysis of multiple cohorts from Europe, Northern America, and Australia. The study limitations include late recruitment of some patients and a notable proportion of missing pathology and treatment data. To alleviate these shortcomings, the collected data have been harmonized and curated. In particular, we ensured that the number of patients under observation right after diagnosis was sufficiently high for an unbiased survival analysis. Furthermore, we applied statistical methods, like multiple imputation and left truncation, to achieve robust conclusions. However, we were not able to address all nuances related to breast cancer diagnosis and management, like the presence of micrometastases, the duration of endocrine therapy, or administration of neoadjuvant chemotherapy. It is worth noting that PREDICT was not trained with a cohort that would have included patients treated with neoadjuvant therapy. We ran a sensitivity analysis excluding cases with known or imputed neoadjuvant chemotherapy. The results were essentially similar to the results of the main analyses, supporting the conclusions presented above.
The PREDICT ER-negative model gives reliable estimates, but the ER-positive model is less well-suited for BRCA1/2 carriers. In particular, our analyses indicate that BRCA2 carriers are a specific group of breast cancer patients, for whom the conventional prognostic estimation is not well-suited. Altogether, our findings encourage including the information on germline pathogenic BRCA1/2 variants in the decision making for adjuvant therapy regimens of breast cancer patients.

METHODS
Study subjects
The study subjects included female breast cancer patients of European ethnic origin enrolled into studies participating in the CIMBA (Table 7). For these analyses, the BRCA1/2 carrier patients were considered eligible if they were diagnosed with primary breast cancer under the age of 70 years, in 1990 or later, and had the following data available: follow-up time after the first invasive breast cancer diagnosis, status (dead/alive) at the end of follow-up, time of DNA sample collection, diagnosis age, and diagnosis year. Study subjects with a previous ovarian cancer diagnosis or those included in the BCAC studies (see below) were excluded from CIMBA. This yielded data from 2892 BRCA1 and 1813 BRCA2 pathogenic variant carriers with breast cancer. The number of patients under observation right after diagnosis was 836; it reached its maximum, 2066, at about 4 years after diagnosis and then decreased steadily to 800 under observation 15 years after diagnosis.
Separate validation was performed in an independent set of BRCA1/2 variant carriers from the BCAC. The variant carrier status was confirmed in gene panel sequencing as a part of the BRIDGES project 35 . Patients with BRCA1/2 pathogenic or likely pathogenic variants (class 4 and 5) were included in the analyses 36 . The study was compliant with the Declaration of Helsinki. All participating studies were approved by their appropriate institutional review boards (Table 7), following their national guidelines for informed consent. The details on study-wise informed consent policies are provided in Table 7.
Phenotype data
All available pathology, treatment, and follow-up data were retrieved from the consortium databases (CIMBA database version 2016, BCAC database release 13). Since these data were incomplete, we applied Multiple Imputation by Chained Equations (MICE) for imputation of the missing values, so that we were able to calculate the PREDICT scores for all patients with available survival data. We assumed that due to the complex relations between the variables, a maximally large sample of observed data would provide the best foundation for imputation. Therefore, additional data from 2138 BRCA1/2 carriers from CIMBA as well as from 126 BRCA1/2 carriers and 32912 non-carriers (including BRCA1/2 variants of unknown significance) from BCAC were included to support the imputation process (Table 7). Data management and statistical analyses were performed with the R environment for statistical computing, version 4.0.0 37 .
We imputed missing data into 50 parallel data matrices with R library mice 38 . The pathology, treatment, and follow-up data from CIMBA and BCAC were harmonized in terms of variable names, types, and coding, and then combined. A Nelson-Aalen estimate of cumulative hazard of overall and breast cancer-specific survival until the end of follow-up time was calculated for all patients with available follow-up time and used in the imputation process. The Nelson-Aalen estimator for breast cancer-specific survival was defined on the basis of studies which provided the cause of death (BC/other) for at least 80% of deceased patients. The prediction matrix, defining the relations of the imputed features, was initiated by pairwise correlation between the features, defined as Spearman rank correlation >0.125, and further modified as follows. Mutual prediction was forced between ER-status and PR-status, as well as ER-status and tumor morphology. Diagnosis year was not allowed to predict HER2-status. The tumor size category was predicted by correlated features, but tumor size in mm (log-scale) was predicted only by the size category (Supplementary Table 1). Data were post-processed so that adjuvant therapy subtypes were not positive if the main type (chemo- or endocrine therapy) was not positive. Trastuzumab treatment was not allowed before year 1997. The imputed data were checked by cross-tabulation, to ensure that the relations between covariables and the BRCA1/2-specific features were retained.
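As an illustration of how a prediction matrix could be initiated from pairwise rank correlations, the following Python sketch marks feature p as a predictor of feature t when their Spearman correlation exceeds the 0.125 threshold. It is a sketch under stated assumptions rather than the study code: the function names are hypothetical, the threshold is applied to the absolute correlation (the text does not say whether the sign was considered), and the features are assumed to be complete, non-constant numeric vectors.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both vectors have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def initial_predictor_matrix(columns, threshold=0.125):
    """matrix[target][predictor] is 1 if |rho| > threshold, else 0.

    Assumption: the absolute correlation is thresholded. The diagonal is 0
    because a feature does not predict itself.
    """
    names = list(columns)
    return {
        t: {p: int(p != t and abs(spearman(columns[t], columns[p])) > threshold)
            for p in names}
        for t in names
    }


# Toy data with hypothetical feature names.
toy = {"grade": [1, 2, 3, 4, 5], "size": [1, 2, 3, 5, 4], "noise": [2, 5, 3, 1, 4]}
matrix = initial_predictor_matrix(toy)
```

Manual overrides such as the forced mutual prediction between ER-status and PR-status would then be applied by setting individual entries of this matrix.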

PREDICT scores
The PREDICT scores were calculated according to the functions presented at the PREDICT website (https://breast.predict.nhs.uk/legal/algorithm, accessed 2021-09-17), including coefficients or functions for diagnosis age, tumor grade (1, 2, 3), tumor largest diameter in mm, positive lymph node count, HER2 status (dichotomous), and fixed coefficients for adjuvant chemo- and endocrine therapy 2,24,25 . Furthermore, the coefficient associated with positive progesterone receptor expression was added as suggested in Grootes et al. 9 . Patients with positive M-status (metastasis at diagnosis) were excluded after multiple imputation, since PREDICT is not applicable for M1 patients. The M-status was missing for a very high proportion of patients (Table 1), and the least biased approach in the context of multiple imputation is to filter the data only after the imputation process. Ki67 data were not available, and the corresponding coefficients were excluded from the score. The expected breast cancer mortality was calculated based on the PREDICT scores and baseline risks for breast cancer and other cause mortality 3 .
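The relation between a proportional prognostic score and expected mortality can be sketched in Python as follows. This is a simplified illustration, not the PREDICT algorithm itself: it assumes a single cause of death and ignores the other-cause mortality that PREDICT also accounts for, and the function name and the numerical inputs are hypothetical.

```python
from math import exp


def expected_bc_mortality(prognostic_index, baseline_cum_hazard):
    """Cumulative breast cancer mortality under proportional hazards.

    prognostic_index: sum of the score components (log hazard ratio scale).
    baseline_cum_hazard: baseline cumulative hazard H0(t) at the horizon t.
    Simplifying assumption: competing other-cause mortality is ignored.
    """
    return 1.0 - exp(-baseline_cum_hazard * exp(prognostic_index))


# Hypothetical values: the same baseline hazard with a low and a high score.
low_risk = expected_bc_mortality(-0.5, 0.10)
high_risk = expected_bc_mortality(1.0, 0.10)
```

A higher score multiplies the baseline cumulative hazard by a larger factor, so the expected mortality increases monotonically with the score.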

Statistical analysis
The analyses were performed in parallel in the 50 imputed datasets, and the final results, e.g., regression model coefficients, model-based predictions, concordance, and expected or observed mortality were pooled according to Rubin's rules or as recommended for event history analysis in Marshall et al. 39,40 . The PREDICT risk categories, defined by PREDICT score percentiles, were pooled by voting: the pooled category for a patient was the category which the patient received most frequently in the 50 imputed datasets. Survival analyses were performed with R library survival 41,42 . The 15-year follow-up started at the first breast cancer diagnosis, and left-truncation was applied to account for delayed entry. Patients were censored at the end of follow-up, if lost from follow-up, or at non-breast-cancer-related death. The PREDICT ER-negative and ER-positive scores were tested separately in the corresponding subgroups of BRCA1 and BRCA2 carrier patients from CIMBA and BCAC. First, the PREDICT score was tested as a linear covariate in a Cox regression model, using the model concordance as a measure of the model fit. The Gönen & Heller unbiased concordance was estimated using R library CPE 43 and pooled to median. Second, the PREDICT score was used as an offset factor in a Cox regression model, where all the score components were included as independent covariates. Here, the diagnosis age was modeled with a spline with three degrees of freedom, grade (1, 2, 3) and the number of affected lymph nodes as numerical variables, tumor size (mm) as a log-scale linear variable, and PgR and HER2 statuses as dichotomous variables (positive vs. negative status). If any of the covariates was associated with significant residual hazard, a reduced score, excluding these covariates, was calculated. The reduced score was then used as an offset, and the hazard associated with these covariates was estimated with a multivariable Cox regression.
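The two pooling steps described above can be sketched in Python as follows. This is an illustrative sketch rather than the custom pooling code of the study: the function names are hypothetical, the Rubin's rules function covers only scalar estimates, and ties in the vote are resolved in favor of the category seen first, which is an assumption.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one scalar estimate over m imputed datasets with Rubin's rules.

    Returns (pooled estimate, pooled standard error). The total variance is
    the mean within-imputation variance plus (1 + 1/m) times the
    between-imputation variance of the point estimates.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])
    between = variance(estimates)  # sample variance, denominator m - 1
    return pooled, sqrt(within + (1 + 1 / m) * between)


def pool_category_by_vote(categories):
    """Return the category a patient received most often across imputations.

    Assumption: ties go to the category that was encountered first.
    """
    return Counter(categories).most_common(1)[0][0]


# Toy example with three imputed datasets.
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
category = pool_category_by_vote(["high", "middle-high", "high"])
```

The between-imputation term is what widens the pooled confidence interval when the imputed values, and hence the estimates, disagree across datasets.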
All Cox regression models were stratified by country, to account for differences in the baseline risk due to differences in treatment practices. The offset models were further adjusted for diagnosis year (linear, continuous), to account for any residual improvement in therapy over the years.
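Written out, the offset model described above takes the following form. The notation is ours and is introduced only for illustration: c indexes the country stratum, PI is the PREDICT score (or the reduced score), x collects the refitted score components, and y is the diagnosis year.

```latex
h_{c}(t \mid \mathrm{PI}, x, y) = h_{0c}(t)\,
  \exp\!\left( 1 \cdot \mathrm{PI} + \beta^{\top} x + \gamma\, y \right)
```

Because the coefficient of PI is fixed at 1, the vector \beta measures only the residual hazard that the score does not already explain. An estimate of \beta close to zero indicates that the weights embedded in PREDICT fit the data, whereas a component of \beta with a sign opposite to its PREDICT coefficient indicates that the score overstates the effect of that factor.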
The PREDICT calibration was studied separately in the CIMBA and BCAC studies, each treated as a merged cohort. The calibration was assessed by splitting the patient data into four (primarily) or three (if the number of cases was low) risk categories based on PREDICT percentiles (0-20%ile, 20-50%ile, 50-80%ile, 80-100%ile or 0-30%ile, 30-70%ile, 70-100%ile) and plotting the expected breast cancer mortality against the observed breast cancer mortality in the quantiles. Optimally, the calibration should be examined by comparing the expected and observed number of events. However, in right-censored, left-truncated data this was not possible, and the observed mortality and the respective cumulative hazard of state transition were retrieved from the Kaplan-Meier survival estimator within each imputed dataset, after which the point estimates and standard errors were pooled according to Rubin's rules. The expected mortality was calculated separately in each imputed dataset based on the average PREDICT score and the baseline cumulative hazard of breast cancer death. The cumulative hazard estimates were pooled to average and transformed to expected mortality. The difference between expected and observed mortality was considered significant if the expected mortality point estimate was outside the pooled 95% confidence interval of the observed mortality. The breast cancer-specific survival of patients in the PREDICT risk categories was visualized with Kaplan-Meier graphs, separately in those CIMBA and BCAC studies that provided cause of death information for at least 80% of deceased patients.
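How delayed entry changes the Kaplan-Meier estimate can be illustrated with a minimal Python sketch: a patient contributes to the risk set only between study entry and the end of follow-up. The function name and the toy data are hypothetical, and the sketch omits standard errors; the study itself used the R library survival.

```python
def km_left_truncated(entry, exit_time, event):
    """Kaplan-Meier survival curve with left truncation (delayed entry).

    entry: time from diagnosis to the start of observation.
    exit_time: time from diagnosis to death or censoring.
    event: 1 for breast cancer death, 0 for censoring.
    Returns a list of (time, survival) at each distinct event time.
    A patient is at risk at time t only if entry < t <= exit_time.
    """
    event_times = sorted({t for t, d in zip(exit_time, event) if d})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for a, b in zip(entry, exit_time) if a < t <= b)
        deaths = sum(
            1 for a, b, d in zip(entry, exit_time, event) if d and b == t and a < t
        )
        if at_risk > 0:
            survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data: two patients observed from diagnosis, two recruited later.
curve = km_left_truncated(
    entry=[0, 0, 2, 3], exit_time=[1, 4, 5, 6], event=[1, 0, 1, 1]
)
observed_mortality = [(t, 1.0 - s) for t, s in curve]
```

With few patients under observation early after diagnosis, the first factors of the product rest on small risk sets, which is why the size of the early risk set matters for an unbiased estimate.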

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The consortia study participant phenotype data used in the current study are not publicly available due to protection of participant privacy and confidentiality, and data ownership belonging to the contributing institutions. However, data can be made available in an anonymized form from the CIMBA and BCAC consortia on a reasonable request and after approval from the contributing studies. Requests for data can be made to the CIMBA and BCAC Data Access Coordination Committees (DACC; https://cimba.ccge.medschl.cam.ac.uk/projects/data-access-requests/; https://bcac.ccge.medschl.cam.ac.uk/bcacdata/). The contact person for data access requests is Manjeet Bolla (mkh39@medschl.cam.ac.uk), Data Manager, Department of Public Health and Primary Care, University of Cambridge. The imputed datasets are available from the corresponding author upon the DACC approval.

CODE AVAILABILITY
All statistical analyses were performed within the R environment for statistical computing version 4.0.0, including libraries mice, survival, and CPE. Custom code, used for pooling the imputed results, calculating the PREDICT scores and baseline risk is available in GitHub (https://github.com/TaruMuranen/PREDICT_for_BRCA1-2).